HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins.

Authors

  • C Bystroff
  • V Thorsson
  • D Baker
Abstract

We describe a hidden Markov model, HMMSTR, for general protein sequence based on the I-sites library of sequence-structure motifs. Unlike the linear hidden Markov models used to model individual protein families, HMMSTR has a highly branched topology and captures recurrent local features of protein sequences and structures that transcend protein family boundaries. The model extends the I-sites library by describing the adjacencies of different sequence-structure motifs as observed in the protein database and, by representing overlapping motifs in a much more compact form, achieves a great reduction in parameters. The HMM attributes a considerably higher probability to coding sequence than does an equivalent dipeptide model, predicts secondary structure with an accuracy of 74.3%, predicts backbone torsion angles better than any previously reported method, and predicts the structural context of beta strands and turns with an accuracy that should be useful for tertiary structure prediction.
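
To make the idea of scoring a sequence under a branched HMM concrete, here is a minimal Python sketch of the standard forward algorithm on a toy model. Everything in it is an assumption made for illustration: the four states (`T`, `H1`, `H2`, `E1`), the three-letter alphabet, and all probabilities are invented and are not HMMSTR's states or parameters. The only feature it shares with the description above is a non-linear topology, in which a turn-like state branches into either a helix-like or a strand-like path.

```python
import math

# Hypothetical toy model (NOT the HMMSTR parameters): a turn-like state T
# branches into a helix-like path (H1 -> H2) or a strand-like state (E1).
START = {"T": 1.0, "H1": 0.0, "H2": 0.0, "E1": 0.0}
TRANS = {
    "T": {"H1": 0.5, "E1": 0.5},   # the branch point
    "H1": {"H2": 1.0},
    "H2": {"H2": 0.7, "T": 0.3},
    "E1": {"E1": 0.6, "T": 0.4},
}
EMIT = {
    "T": {"G": 0.8, "A": 0.1, "V": 0.1},
    "H1": {"A": 0.7, "G": 0.1, "V": 0.2},
    "H2": {"A": 0.6, "G": 0.1, "V": 0.3},
    "E1": {"V": 0.7, "A": 0.2, "G": 0.1},
}


def forward_log_prob(seq, start, trans, emit):
    """Return log P(seq | model) using the scaled forward algorithm."""
    states = list(start)
    # alpha[s]: probability of the symbols seen so far, ending in state s.
    alpha = {s: start[s] * emit[s].get(seq[0], 0.0) for s in states}
    log_p = 0.0
    for symbol in seq[1:]:
        total = sum(alpha.values())
        if total == 0.0:
            return float("-inf")  # the model cannot produce this prefix
        # Rescale so alpha sums to 1, accumulating the factor in log space.
        log_p += math.log(total)
        alpha = {s: a / total for s, a in alpha.items()}
        # Propagate one step: sum over predecessors, then emit the symbol.
        alpha = {
            t: emit[t].get(symbol, 0.0)
            * sum(alpha[s] * trans[s].get(t, 0.0) for s in states)
            for t in states
        }
    total = sum(alpha.values())
    if total == 0.0:
        return float("-inf")
    return log_p + math.log(total)


if __name__ == "__main__":
    for s in ("GAA", "GVV", "GGG"):
        print(s, forward_log_prob(s, START, TRANS, EMIT))
```

Because the forward algorithm sums over every path through the branches, a sequence is scored by all motifs that could explain it rather than by a single best path. Comparing such a log-probability against that of a simpler baseline model is the kind of comparison the abstract reports for the dipeptide model.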

Similar resources

Predicting interresidue contacts using templates and pathways.

We present a novel method, HMMSTR-CM, for protein contact map predictions. Contact potentials were calculated by using HMMSTR, a hidden Markov model for local sequence structure correlations. Targets were aligned against protein templates using a Bayesian method, and contact maps were generated by using these alignments. Contact potentials then were used to evaluate these templates. An ab initi...

Improved pairwise alignments of proteins in the Twilight Zone using local structure predictions

MOTIVATION In recent years, advances have been made in the ability of computational methods to discriminate between homologous and non-homologous proteins in the 'twilight zone' of sequence similarity, where the percent sequence identity is a poor indicator of homology. To make these predictions more valuable to the protein modeler, they must be accompanied by accurate alignments. Pairwise sequ...

Five hierarchical levels of sequence-structure correlation in proteins.

This article reviews recent work towards modelling protein folding pathways using a bioinformatics approach. Statistical models have been developed for sequence-structure correlations in proteins at five levels of structural complexity: (i) short motifs; (ii) extended motifs; (iii) nonlocal pairs of motifs; (iv) 3-dimensional arrangements of multiple motifs; and (v) global structural homology. ...

Remote homolog detection using local sequence-structure correlations.

Remote homology detection refers to the detection of structural homology in proteins when there is little or no sequence similarity. In this article, we present a remote homolog detection method called SVM-HMMSTR that overcomes the reliance on detectable sequence similarity by transforming the sequences into strings of hidden Markov states that represent local folding motif patterns. These stat...

The Grammatical Structure of Protein Sequences in a Single Non-linear Hidden Markov Model; Prediction of Coding Regions, Secondary and Tertiary Structure

We describe a hidden Markov model, HMMSTR, for general protein sequence based on the I-sites library of sequence-structure motifs [1]. Unlike the linear HMMs used to model individual protein families, HMMSTR has a highly branched topology and captures recurrent local features of protein sequences and structures that transcend protein family boundaries. The model extends the I-sites library by d...

Journal:
  • Journal of molecular biology

Volume 301, Issue 1

Pages: -

Publication date: 2000